Instructor: Zahra Shakeri – Fall 2023
Course Project-Phase #2


Project Description and Instructions

The course project for CHL5230H is divided into three main parts, concluding with a final presentation and report. Two of these parts serve as the main datathons in the course, and your performance will be assessed both as a datathon and as part of your ongoing project. The second phase of your project counts as Datathon #3 and marks the start of your data analysis. The list of all teams and their chosen project datasets can be found here. Please review the list to confirm that your information is accurate. Your team number for this datathon is the number listed in this form. Please note that this datathon does not require a presentation submission; the code and the report are the primary deliverables for this assignment.

Instructions for Submission

In this datathon, you will need to submit two main components: a low-fidelity prototype and a written report. Because this is an early stage of your project, no presentation component is required.

1. Low-fidelity Prototype (In-class Submission)

The first phase of this Datathon involves collaborative efforts among students, aimed at transforming the provided datasets into actionable insights. Teams should formulate research questions and outline their data analysis plans, followed by submitting a low-fidelity prototype of their solution to Assignments/Course Project/Project Phasee #2/Low-fidelity Prototype. Please adhere to the naming convention outlined later in this document when naming your one-page PDF submission for today.

Every team is required to submit its low-fidelity prototype through Quercus by 8:00 PM on October 17, 2023. A successful submission should include a clear and legible list of the research questions you plan to address using your course project dataset. Additionally, provide a detailed plan specifying the analysis methods (e.g., machine learning) you intend to use to address these questions. Ensure that each research question is paired with its own analysis plan.

Please note that you are not obligated to finalize your solution or research questions at this stage. If you come up with a better idea during the week, feel free to update your plan. The primary goal of the low-fidelity milestone is to initiate the brainstorming phase of a data science project, which is typically the initial and most critical phase. It allows you to see how the project’s direction may evolve during your analysis.

2. A High-fidelity Prototype

For this project phase, due by 11:59 PM on October 31, you need to submit a PDF file covering the following key points:

  • Introduction: Start by explaining the two to three main research questions of your project and why each question is important. We recommend structuring the introduction as follows: describe the problem you want to explore, explain its significance and why it needs investigation, present the research questions aimed at addressing the problem, and clarify how answering these questions can help solve it. At this stage, it is important to review the literature to justify the importance of your work and the need for your proposed research questions. You should review at least 10 relevant research studies, whether clinical or technical, to ensure you have a good understanding of the field you are exploring or contributing to in this project.

  • Methods: Provide detailed information about your dataset, the data exploration phase, and the implementation of one predictive model chosen from those covered in the course on or before October 31.

  • Results: Detail the results you have achieved so far. Keep in mind that this section will be updated in (at least) two more iterations.

  • Discussion: Discuss the results, your exploratory process, and the key findings of your work up to this point. Address any limitations you have encountered and outline your plan to address them. Also, provide a brief paragraph outlining the next steps in your work.

  • Individual Contributions: Highlight the contributions of each team member throughout the entire process.

  • Code and Presentation: Store your Datathon materials, including notebooks and datasets, on GitHub. Share the GitHub project link in the report for easy access by the TA.

Note: Please ensure that the length of this submission, including references and figures, does not exceed four pages. The TA will not evaluate pages beyond the fourth.

A single submission per team is acceptable for this phase.

Important Dates

Component               Due Time               Where to Submit?
Low-fidelity Prototype  October 17, 8:00 PM    Assignments/Course Project/Project Phasee #2/Low-fidelity Prototype
Written Report          October 31, 11:59 PM   Assignments/Course Project/Project Phasee #2/Written Report